Equalizing imbalanced imprecise datasets for genetic fuzzy classifiers

نویسندگان

  • Ana M. Palacios
  • Luciano Sánchez
  • Inés Couso
چکیده

Determining whether an imprecise dataset is imbalanced is not immediate. The vagueness in the data causes that the prior probabilities of the classes are not precisely known, and therefore the degree of imbalance can also be uncertain. In this paper we propose suitable extensions of different resampling algorithms that can be applied to interval valued, multi-labelled data. By means of these extended preprocessing algorithms, certain classification systems designed for minimizing the fraction of misclassifications are able to produce knowledge bases that are also adequate under common metrics for imbalanced classification.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

ارائه‌روش جدید مبتنی‌بر برنامه‌نویسی ژنتیک برای وزن‌دهی قوانین فازی در طبقه‌بندی نامتوازن

In classification problems, we often encounter datasets with different percentage of patterns (i.e. classes with a high pattern percentage and classes with a low pattern percentage). These problems are called “classification Problems with imbalanced data-sets”. Fuzzy rule based classification systems are the most popular fuzzy modeling systems used in pattern classification problems. Rule weights...

متن کامل

Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms

In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...

متن کامل

Combining Adaboost with Preprocessing Algorithms for Extracting Fuzzy Rules from Low Quality Data in Possibly Imbalanced Problems

An extension of the Adaboost algorithm for obtaining fuzzy rule-based systems from low quality data is combined with preprocessing algorithms for equalizing imbalanced datasets. With the help of synthetic and real-world problems, it is shown that the performance of the Adaboost algorithm is degraded in presence of a moderate uncertainty in either the input or the output values. It is also estab...

متن کامل

First study of the behaviour of genetic fuzzy classifier based on low quality data respect to the preprocessing of low quality imbalanced datasets

There are real-world dataset where we can found classes with a very different percentage of patterns between them, that is to say we have classes represented by many examples (high percentage of patterns) and classes represented by few examples (low percentage of patterns). These kind of datasets receive the name of “imbalanced datasets”. In the field of classification problems the imbalanced d...

متن کامل

Multi-objective genetic fuzzy classifiers for imbalanced and cost-sensitive datasets

We exploit an evolutionary three-objective optimization algorithm to produce a Pareto front approximation composed of fuzzy rule-based classifiers (FRBCs) with different trade-offs between accuracy (expressed in terms of sensitivity and specificity) and complexity (computed as sum of the conditions in the antecedents of the classifier rules). Then, we use the ROC convex hull method to select th...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Int. J. Computational Intelligence Systems

دوره 5  شماره 

صفحات  -

تاریخ انتشار 2012